Skip to content

Rename image_exists to remote_image_exists#471

Merged
simonrosenberg merged 1 commit intomainfrom
rename-image-exists-to-remote
Mar 2, 2026
Merged

Rename image_exists to remote_image_exists#471
simonrosenberg merged 1 commit intomainfrom
rename-image-exists-to-remote

Conversation

@simonrosenberg
Copy link
Copy Markdown
Collaborator

Summary

  • Renamed image_existsremote_image_exists in benchmarks/utils/image_utils.py for clarity: the function only checks remote registries, while local_image_exists was already named correctly
  • Updated all 10 import and call sites across the codebase
  • Added a one-line docstring to the function

Split out from #455.

CI validation

All benchmarks pass with this rename (validated in the parent PR):

Benchmark Status Run
gaia Pass https://github.com/OpenHands/evaluation/actions/runs/22590769837
swebench Pass https://github.com/OpenHands/evaluation/actions/runs/22590769394
swtbench Pass https://github.com/OpenHands/evaluation/actions/runs/22590775404
commit0 Pass https://github.com/OpenHands/evaluation/actions/runs/22590778789
swebenchmultimodal Pass https://github.com/OpenHands/evaluation/actions/runs/22590785406

Test plan

  • Verify no remaining references to the old image_exists name (grep -r "image_exists" --include="*.py" returns only local_image_exists and remote_image_exists)
  • All call sites updated with the new name

🤖 Generated with Claude Code

The function only checks remote registries, while local_image_exists
was already named correctly. This rename makes the distinction explicit.

Updated the definition in image_utils.py and all 10 import/call sites.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good taste

This is exactly what a refactoring should be: simple, complete, and solves a real problem.

KEY INSIGHT: The rename from image_exists to remote_image_exists eliminates naming ambiguity when paired with local_image_exists. The relationship is now immediately clear without needing to read the implementation.

Why this works:

  • Solves actual maintainability issue (naming confusion)
  • Complete and systematic (all 10 call sites updated)
  • No complexity added (pure rename + docstring)
  • Already validated by CI

Particularly nice: build_eval_env_images.py previously needed import image_exists as remote_image_exists as a workaround. Now it just imports the correctly-named function directly.

Worth merging: Clean refactoring with no issues.

@simonrosenberg simonrosenberg merged commit 92efb47 into main Mar 2, 2026
3 checks passed
simonrosenberg added a commit that referenced this pull request Mar 3, 2026
Resolve conflicts after sub-PRs #471, #472, #473 were merged to main:
- Take main's SDK_SHORT_SHA deprecation handling in version.py
- Take main's backward-compat SDK_SHORT_SHA in modal_patches.py
- Take main's log message wording in swebench/run_infer.py
- Remove redundant prompt_dir setup (superseded by add_prompt_path_argument)
- Keep lazy imports for git-dependent modules (build_utils.py, swtbench/run_infer.py)
- Take main's comment cleanup in swtbench/eval_infer.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
juanmichelini pushed a commit that referenced this pull request Mar 3, 2026
Root Cause:
PR #471 (commit 2a7ff00) renamed image_exists() to remote_image_exists()
and removed the local Docker check on March 2, 2026. This caused every
build_image() call to make HTTP requests to GHCR, creating 500+ concurrent
requests when building large batches (e.g., all SWE-Bench images).

Impact:
- Small builds (1-50 images): Manageable HTTP load, still works
- Large builds (500 images): Overwhelming HTTP load, causes 5+ hour hangs
- Last successful 500-image build: January 8, 2026
- First stuck builds: March 3, 2026 (immediately after the PR merged)

Fix:
Restore local_image_exists() check BEFORE remote_image_exists() check
in build_image(). This ensures:
1. Fast local Docker checks happen first (<100ms per image)
2. Remote registry checks only for truly missing images
3. Batch builds complete in reasonable time (5-40 minutes)

Testing:
- All 14 tests in tests/test_image_utils.py pass
- Pre-commit checks pass (ruff, pycodestyle, pyright)

Fixes #476

Co-authored-by: openhands <openhands@all-hands.dev>
juanmichelini pushed a commit that referenced this pull request Mar 4, 2026
Root Cause:
PR #471 (commit 92efb47) renamed image_exists() to remote_image_exists()
and removed the local Docker check on March 2, 2026. This caused every
build_image() call to make HTTP requests to GHCR, creating 500+ concurrent
requests when building large batches (e.g., all SWE-Bench images).

Impact:
- Small builds (1-50 images): Manageable HTTP load, still works
- Large builds (500 images): Overwhelming HTTP load, causes 5+ hour hangs
- Last successful 500-image build: January 8, 2026
- First stuck builds: March 3, 2026 (immediately after the PR merged)

Fix:
Restore local_image_exists() check BEFORE remote_image_exists() check
in build_image(). This ensures:
1. Fast local Docker checks happen first (<100ms per image)
2. Remote registry checks only for truly missing images
3. Batch builds complete in reasonable time (5-40 minutes)

Testing:
- All 14 tests in tests/test_image_utils.py pass
- Pre-commit checks pass (ruff, pycodestyle, pyright)

Fixes #476

Co-authored-by: openhands <openhands@all-hands.dev>
@simonrosenberg simonrosenberg deleted the rename-image-exists-to-remote branch March 4, 2026 19:25
@simonrosenberg simonrosenberg self-assigned this Mar 4, 2026
simonrosenberg pushed a commit that referenced this pull request Mar 4, 2026
After investigation based on reviewer feedback:
- PR #471 only renamed functions, did not remove any local Docker check
- The image_exists function was always remote-only (HTTP requests)
- March 3rd stuck builds were transient - builds before/after succeeded
- Builds are now working without this fix being merged

This change is now correctly framed as a performance optimization:
- Adds local Docker check before remote registry check
- Skips HTTP requests when images exist locally
- Useful for local development and re-runs

Co-authored-by: openhands <openhands@all-hands.dev>
juanmichelini pushed a commit that referenced this pull request Mar 11, 2026
The revert of 2bfcc6c correctly removed the functions added in that commit,
but didn't account for the subsequent rename of image_exists to remote_image_exists
in commit 92efb47 (#471).

Since we're reverting to the pre-2bfcc6c state, the function should be called
image_exists, not remote_image_exists.

Changes:
- benchmarks/swebench/build_images.py: import and use image_exists
- benchmarks/gaia/build_images.py: import and use image_exists
- benchmarks/swtbench/build_eval_env_images.py: use alias (as it was pre-2bfcc6c)

This fixes the ImportError that was causing SWE-bench builds to fail.

Co-authored-by: openhands <openhands@all-hands.dev>
mkaramuk added a commit to mkaramuk/benchmarks that referenced this pull request Apr 8, 2026
The function was renamed in OpenHands#471 but the swesmith scaffold still
referenced the old name, breaking the pyright pre-commit check.
mkaramuk added a commit to mkaramuk/benchmarks that referenced this pull request Apr 13, 2026
The function was renamed in OpenHands#471 but the swesmith scaffold still
referenced the old name, breaking the pyright pre-commit check.
juanmichelini pushed a commit that referenced this pull request Apr 14, 2026
* feat(swesmith): add SWE-Smith benchmark scaffold

Add core inference and evaluation scripts for running OpenHands agents
on SWE-Smith task instances.

Co-Authored-By: Muhammed Karamuk <mkaramuk@proton.me>

* docs(swesmith): add prompt template, README, and env example

Co-Authored-By: Muhammed Karamuk <mkaramuk@proton.me>

* fix(utils): support .json files in dataset loader

Co-Authored-By: Rb <rubenwolff@gmail.com>

* fix: update swesmith dep to PyPI release and adapt to n_critic_runs rename

Switch swesmith dependency from local path to PyPI (>=0.0.9) and add
missing script entry points. Rename max_attempts to n_critic_runs to
match upstream breaking change.

* fix(swesmith): rename image_exists to remote_image_exists

The function was renamed in #471 but the swesmith scaffold still
referenced the old name, breaking the pyright pre-commit check.

* test(swesmith): add test data and mocks for metrics collection test

The auto-discovered swesmith benchmark fell into generic fallbacks in
test_benchmark_metrics_collection that lacked required fields, causing
test failures.

- Add swesmith branch to _get_test_instance_for_benchmark with repo,
  image_name, base_commit, and problem_statement
- Add swesmith branch to _create_metadata_for_benchmark with prompt_path
  and details
- Add swesmith-specific patches for run_conversation_with_fake_user_response,
  build_event_persistence_callback, and _find_ssh_key

* fix(swesmith): use swesmith registry for mirror URL with public/private detection

The previous logic chose SSH vs HTTPS based solely on whether an SSH key
existed on disk, which failed when users had a key unrelated to the
swesmith mirror org. Delegating to swesmith's own profile.mirror_url
makes the URL choice driven by an actual GitHub API check of the mirror
repo's visibility, so public mirrors use HTTPS unconditionally.

- Import swesmith.profiles.registry and register custom profiles
- Replace manual URL construction with profile.mirror_url
- Patch registry.get_from_inst in test_metrics.py to decouple the test
  from real profile lookup

---------

Co-authored-by: Rb <rubenwolff@gmail.com>
GaokaiZhang pushed a commit to GaokaiZhang/benchmarks that referenced this pull request Apr 17, 2026
…s#428)

* feat(swesmith): add SWE-Smith benchmark scaffold

Add core inference and evaluation scripts for running OpenHands agents
on SWE-Smith task instances.

Co-Authored-By: Muhammed Karamuk <mkaramuk@proton.me>

* docs(swesmith): add prompt template, README, and env example

Co-Authored-By: Muhammed Karamuk <mkaramuk@proton.me>

* fix(utils): support .json files in dataset loader

Co-Authored-By: Rb <rubenwolff@gmail.com>

* fix: update swesmith dep to PyPI release and adapt to n_critic_runs rename

Switch swesmith dependency from local path to PyPI (>=0.0.9) and add
missing script entry points. Rename max_attempts to n_critic_runs to
match upstream breaking change.

* fix(swesmith): rename image_exists to remote_image_exists

The function was renamed in OpenHands#471 but the swesmith scaffold still
referenced the old name, breaking the pyright pre-commit check.

* test(swesmith): add test data and mocks for metrics collection test

The auto-discovered swesmith benchmark fell into generic fallbacks in
test_benchmark_metrics_collection that lacked required fields, causing
test failures.

- Add swesmith branch to _get_test_instance_for_benchmark with repo,
  image_name, base_commit, and problem_statement
- Add swesmith branch to _create_metadata_for_benchmark with prompt_path
  and details
- Add swesmith-specific patches for run_conversation_with_fake_user_response,
  build_event_persistence_callback, and _find_ssh_key

* fix(swesmith): use swesmith registry for mirror URL with public/private detection

The previous logic chose SSH vs HTTPS based solely on whether an SSH key
existed on disk, which failed when users had a key unrelated to the
swesmith mirror org. Delegating to swesmith's own profile.mirror_url
makes the URL choice driven by an actual GitHub API check of the mirror
repo's visibility, so public mirrors use HTTPS unconditionally.

- Import swesmith.profiles.registry and register custom profiles
- Replace manual URL construction with profile.mirror_url
- Patch registry.get_from_inst in test_metrics.py to decouple the test
  from real profile lookup

---------

Co-authored-by: Rb <rubenwolff@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants